# load libraries ----
# reporting
library(knitr)
# visualization
library(ggplot2)
library(ggthemes)
library(patchwork)
# data wrangling
library(dplyr)
library(tidyr)
library(readr)
library(skimr)
library(janitor)
library(magrittr)
library(lubridate)
library(glue)
# lake models
library(GLMr)
library(glmtools)
# statistical models
library(tidymodels)
# set other options ----
options(scipen=999)
knitr::opts_chunk$set(
tidy = FALSE,
message = FALSE,
warning = FALSE)

32 Teleconnections II
Learning Objectives
After completing this tutorial you should be able to:
- describe the concepts of macrosystems and teleconnections and how different ecological processes can interact at local, regional, and global scales.
- set up and run lake models to simulate lake temperatures, thermal structure, and ice cover in multiple lakes.
- formulate hypotheses about the effects of teleconnected climate scenarios on different lake modules and test them using simulations.
- describe how local characteristics modify global-scale climate forcing effects on lake temperatures and ice cover.
Download the directory for this project here. Make sure the directory is unzipped, then move it to your bi328 directory. You can open the Rproj for this module either by double-clicking on it (which will launch RStudio), by opening RStudio and then using File > Open Project, or by clicking on the Rproject icon in the top right of your program window and selecting Open Project.
There should be a file named 32_elnino_II.qmd in that project directory. Use that file to work through this tutorial - you will hand in your rendered ("knitted") Quarto file as your homework assignment. So, first thing: in the YAML header, change the author to your name. You will use this Quarto document to record your answers. Remember to use comments to annotate your code; at minimum you should have one comment per code set1, though you may of course add as many comments as you need to be able to recall what you did. Similarly, take notes in the document as we discuss the discussion/reflection questions, but make sure that you go back and clean them up for "public consumption".
1 You should do this whether you are adding code yourself or using code from our manual, even if it isn't commented in the manual. Especially when the code is already included for you, add comments to describe how the function works/what it does as we introduce it during the participatory coding session so you can refer back to it.
Before you get started, let’s make sure to read in all the packages we will need. Double check that everything is installed that you need.
32.1 Create typical El Nino year scenario
Now that we know what ice cover, surface/bottom temperature, and thermal lake structure look like for the baseline scenario, we are going to explore whether there are teleconnections between El Nino and lake temperatures for two scenarios simulating differing El Nino conditions.
We will use observed historical climate data to determine how much warmer or cooler the local air temperature is likely to be during an El Nino year for our data set.
The data folder contains text files with historical climate data - make sure that you change the code to read in the data for your specific lake. We are going to use data starting in the year 19702.
2 Since the 1970s there has been a change in some of the drivers and processes of how El Nino is formed. So even though overall the effect is still due to a change in the surface temperatures in the Pacific Ocean, some of the regional effects have changed.
annual_temp <- read_delim("data/sunapee_clim.txt") %>%
clean_names() %>%
filter(year >= 1970)

Let's use skim() to get a quick look at the data set:
skim(annual_temp)

| Name | annual_temp |
|---|---|
| Number of rows | 48 |
| Number of columns | 11 |
| _______________________ | |
| Column type frequency: | |
| character | 5 |
| numeric | 6 |
| ________________________ | |
| Group variables | None |
Variable type: character
| skim_variable | n_missing | complete_rate | min | max | empty | n_unique | whitespace |
|---|---|---|---|---|---|---|---|
| lake_id | 0 | 1 | 7 | 7 | 0 | 1 | 0 |
| state | 0 | 1 | 13 | 13 | 0 | 1 | 0 |
| station | 0 | 1 | 11 | 11 | 0 | 2 | 0 |
| source | 0 | 1 | 9 | 9 | 0 | 1 | 0 |
| type | 0 | 1 | 6 | 7 | 0 | 3 | 0 |
Variable type: numeric
| skim_variable | n_missing | complete_rate | mean | sd | p0 | p25 | p50 | p75 | p100 | hist |
|---|---|---|---|---|---|---|---|---|---|---|
| year | 0 | 1.00 | 1993.50 | 14.00 | 1970.0 | 1981.75 | 1993.5 | 2005.25 | 2017.0 | ▇▇▇▇▇ |
| air_temp_mean_c | 1 | 0.98 | 6.68 | 0.73 | 5.3 | 6.10 | 6.5 | 7.30 | 8.4 | ▃▇▃▅▂ |
| air_temp_max_c | 1 | 0.98 | 13.30 | 0.76 | 12.0 | 12.80 | 13.2 | 13.75 | 15.1 | ▃▇▆▃▂ |
| air_temp_min_c | 1 | 0.98 | 0.06 | 0.82 | -1.8 | -0.45 | -0.1 | 0.60 | 1.9 | ▁▅▇▃▂ |
| rain_mm | 1 | 0.98 | 1166.50 | 172.10 | 814.3 | 1010.20 | 1181.0 | 1255.55 | 1583.8 | ▂▇▇▃▂ |
| snow_mm | 3 | 0.94 | 2216.07 | 718.70 | 964.0 | 1774.00 | 2138.0 | 2566.00 | 4602.0 | ▅▇▆▂▁ |
Let's take a look at differences in air temperature between El Nino and La Nina years.
Next, we want to determine how much warmer or colder a typical El Nino year is compared to a typical neutral year.
Let's work through how we can determine what an El Nino year should look like for Lake Sunapee.
First, let's check what type of year 2013 is, which is the year we simulated the thermal structure for as a baseline. We are going to acquire a new function for our arsenal: pull(). This allows us to extract a column from a data set as a vector. In this case it will be a single value, because we will first filter the data set to only contain entries for the year 2013.
annual_temp %>%
filter(year == 2013) %>%
pull(type)

[1] "Neutral"
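For comparison, the same lookup can be written without the pipe using base R subsetting; this is just an equivalent sketch, not part of the tutorial workflow:

```r
# base R equivalent: subset the type column to the rows where year is 2013
annual_temp$type[annual_temp$year == 2013]
```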
We want to know what the offset is between a linear regression describing neutral and El Nino years. Let’s visualize this by subsetting our data set to contain only neutral and El Nino years and adding a linear trend line.
annual_temp %>%
filter(type %in% c("Neutral", "ElNino")) %>%
ggplot(aes(x = year, y = air_temp_mean_c, color = type)) +
geom_point(shape = 21, size = 3) +
geom_smooth(method = "lm", se = FALSE) +
labs(x = "year", y = "mean annual air temperature [C]")

To determine the typical offset, we will use linear regressions to calculate estimated (expected) values for 2013 if it were a typical neutral year and if it were a typical El Nino year. Then we will calculate the difference between those two estimated values as the typical offset to create our met_hourly input data set for a typical El Nino year - i.e. we will add the expected offset to the values used in the baseline scenario to create a simulation of the lake thermal structure if 2013 had been a typical El Nino year.
Let’s start by fitting a linear regression model to the neutral years. For this we need to start by creating a subset of the data containing only neutral years.
neutral_temp <- annual_temp %>%
filter(type == "Neutral")

Now, let's create a linear regression modeling the relationship of the mean annual temperature and the year for Lake Sunapee using the tidymodels framework.
lm_fit <- linear_reg() %>% # specify model
set_engine("lm") %>% # define computational engine
fit(air_temp_mean_c ~ year, data = neutral_temp) # define variables

Let's identify the slope and intercept for our linear regression (i.e. the equation describing the relationship of temperature and year).
summary(lm_fit$fit) %>%
tidy()

# A tibble: 2 × 5
term estimate std.error statistic p.value
<chr> <dbl> <dbl> <dbl> <dbl>
1 (Intercept) -51.9 18.2 -2.86 0.00862
2 year 0.0293 0.00910 3.22 0.00363
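As a quick sanity check, we can plug 2013 into the regression equation by hand using the rounded coefficients from the summary above; small differences from the values we compute later are due to rounding:

```r
# expected neutral-year mean temperature for 2013, computed by hand
# from the rounded intercept (-51.9) and slope (0.0293) shown above
-51.9 + 0.0293 * 2013
# should come out near 7.1 C
```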
We are going to use the function pull() to create variables with our neutral slope and intercept (we are going to need these later).
neutral_slope <- summary(lm_fit$fit) %>%
tidy() %>%
filter(term == "year") %>%
pull(estimate)
neutral_intercept <- summary(lm_fit$fit) %>%
tidy() %>%
filter(term == "(Intercept)") %>%
pull(estimate)

We can use the function augment() to create a data.frame that contains the observed data (air_temp_mean_c) for a given year along with the value calculated using the equation for our linear regression (.fitted).
lm_aug <- augment(lm_fit$fit)
head(lm_aug) %>%
kable()

| .rownames | air_temp_mean_c | year | .fitted | .resid | .hat | .sigma | .cooksd | .std.resid |
|---|---|---|---|---|---|---|---|---|
| 1 | 6.3 | 1976 | 6.000079 | 0.2999213 | 0.1271902 | 0.6060471 | 0.0210767 | 0.5378358 |
| 2 | 6.4 | 1977 | 6.029405 | 0.3705947 | 0.1183401 | 0.6041533 | 0.0293429 | 0.6612273 |
| 3 | 5.3 | 1978 | 6.058732 | -0.7587320 | 0.1099549 | 0.5862189 | 0.1121352 | -1.3473629 |
| 4 | 6.6 | 1979 | 6.088059 | 0.5119413 | 0.1020346 | 0.5992364 | 0.0465418 | 0.9050916 |
| 5 | 5.8 | 1980 | 6.117385 | -0.3173854 | 0.0945791 | 0.6057531 | 0.0163096 | -0.5588096 |
| 6 | 6.4 | 1981 | 6.146712 | 0.2532879 | 0.0875885 | 0.6072207 | 0.0094726 | 0.4442437 |
We are really only interested in the expected value for the year 2013. Again, we can use the function pull() to extract a single column from a data.frame, which makes it easy to create a variable with our expected mean annual temperature for a “neutral 2013”.
neutral_2013 <- lm_aug %>%
filter(year == 2013) %>%
pull(.fitted)

Now, let's fit a linear regression for an El Nino year.
# create subset with El Nino years
elnino_temp <- annual_temp %>%
filter(type == "ElNino")
# fit linear regression
lm_fit <- linear_reg() %>% # specify model
set_engine("lm") %>% # define computational engine
fit(air_temp_mean_c ~ year, data = elnino_temp) # define variables

For our neutral data, we were able to use the function augment() because it calculates the expected (.fitted) value for the dependent variable (here, temperature) for every observed value of the independent variable (here, year) in the data set. So since 2013 is a neutral year, it will calculate the neutral expectation for 2013.
Our El Nino data set does not include the year 2013. Therefore, we need to first specify the set of years that we want to plug into the equation describing the relationship of temperature and year for El Nino years.
# create data frame year(s) to predict
predict_year <- expand.grid(year = 2013)
# calculate expected values
pred <- predict(lm_fit, new_data = predict_year)
# extract predicted value
elnino_2013 <- pred %>%
pull(.pred)

Now, we can calculate the estimated El Nino offset as the difference between the predicted neutral and typical El Nino temperatures calculated for 2013 using the linear regression equations.
typical_offset <- elnino_2013 - neutral_2013
LakeName <- "Sunapee"
saveRDS(typical_offset, file = glue("results/{LakeName}_typical-offset"))

We're almost there - now all we have to do is create the met_hourly input file for the scenario we want to simulate. In your sim_lakes folder create a new folder called Sunapee_typical3, and make a copy of the outflow.csv, inflow.csv, and glm2.nml files from the Sunapee folder4.
3 remember to change this according to the lake you are simulating
4 Again, change this so you are adding a copy of the nml file corresponding to the lake you are working on.
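If you prefer to set up the scenario folder from R instead of in your file browser, something like this sketch should work (the paths assume the Sunapee example; adjust both folder names for your lake):

```r
# create the scenario folder (no warning if it already exists)
dir.create("sim_lakes/Sunapee_typical", showWarnings = FALSE)
# copy the input files over from the baseline simulation folder
file.copy(from = file.path("sim_lakes/Sunapee",
                           c("outflow.csv", "inflow.csv", "glm2.nml")),
          to = "sim_lakes/Sunapee_typical")
```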
# specify your lake
LakeName <- "Sunapee"
# specify your lake simulation for observed data
sim_folder <- file.path("sim_lakes", LakeName)
# specify new sim folder for El Nino year
sim_folder_typical <- glue("sim_lakes/{LakeName}_typical")

Now, let's create a new met_hourly file for our "typical El Nino year" scenario. We are going to change the air temperature by increasing it according to the offset between neutral and El Nino years we just calculated. This means we can do this completely in R, by reading in the baseline met data, then using mutate() to add the offset to the air temperature, and writing it back out to file.
# file path for original met file
baseline_met <- file.path(sim_folder, "met_hourly.csv")
# read in baseline met data
met_data <- read_delim(baseline_met, delim = ",") %>%
mutate(AirTemp = AirTemp + typical_offset, # add El Nino offset
time = as.POSIXct(strptime(time, "%Y-%m-%d %H:%M:%S", tz="EST"))) # make sure correct date/time format
# write new met data to new sim folder
write_delim(met_data, file.path(sim_folder_typical, "met_hourly_typical.csv"), delim = ",")

Next, we need to edit the nml file to make sure that it points to the correct met file. You can do this by opening the nml file in an external text editor or, depending on your RStudio version, by clicking on the glm2.nml file in your Sunapee_typical folder in the RStudio Files tab and selecting View File, which will open it in the Viewer panel.
You will need to scroll to the meteorology section5 and change the meteo_fl entry to match the new file name (met_hourly_typical.csv)6.
5 You can also use the find function!
6 Make sure to save the file before closing it!
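Alternatively, you may be able to make this edit from R. This is a sketch assuming your glmtools installation provides set_nml() and write_nml(); note that it overwrites the glm2.nml file in the scenario folder:

```r
# read the nml file, update the met file entry, and write it back out
nml <- read_nml(file.path("sim_lakes/Sunapee_typical", "glm2.nml"))
nml <- set_nml(nml, "meteo_fl", "met_hourly_typical.csv")
write_nml(nml, file.path("sim_lakes/Sunapee_typical", "glm2.nml"))
```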
Let’s double check that we did this correctly:
# define file path
nml_file_typical <- file.path(sim_folder_typical, "glm2.nml")
# read nml file
nml <- read_nml(nml_file_typical)
# check met file name
get_nml_value(nml, "meteo_fl")

[1] "met_hourly_typical.csv"
If the printout in the console is not the met file for the typical El Nino scenario you just created, make the edit to the nml file and make sure to save the change!
32.2 Create an extreme El Nino year scenario
We created a typical El Nino year scenario. From comparing the annual mean temperatures in our historical climate data set, you know that there is strong variation in how much El Nino years differ from a neutral year.
Let’s identify the largest observed offset that our lake has experienced since 1970. Let’s take another look at our observed annual temperature data for neutral and El Nino years.
annual_temp %>%
filter(!type == "LaNina") %>%
ggplot(aes(x = year, y = air_temp_mean_c, color = type)) +
geom_smooth(method = "lm", se = FALSE) +
geom_point(size = 3) +
scale_color_manual(values = c("darkorange", "darkblue")) +
labs(x = "year", y = "mean annual air temp [C]")

We could eyeball which of the El Nino years appears to have the largest offset from a neutral year. But since we calculated the slope and intercept of the regression line describing the relationship of temperature and time for neutral years, we can instead calculate the offsets for all El Nino years and then identify the largest offset.
We can do this by calculating the estimated temperature for each year if it was a neutral year, and then calculating the offset.
# identify maximum offset
max_offset <- elnino_temp %>%
mutate(neutral_est = (neutral_slope * year) + neutral_intercept,
offset = air_temp_mean_c - neutral_est) %>%
summarize(max_offset = max(offset)) %>%
pull(max_offset)
saveRDS(max_offset, file = glue("results/{LakeName}_max-offset"))

Now, we can use this value to generate a strong El Nino scenario by repeating the steps from above.
First, create a new subfolder sim_lakes/Sunapee_extreme7. Then place a copy of the met_hourly.csv, inflow.csv, outflow.csv, and glm2.nml file in that folder.
7 Remember that you should name this folder according to the lake you are exploring.
Next, let’s create our new met file by adding the maximum offset we identified in the observed data to the air temperature.
# create file path for extreme El Nino simulation
sim_folder_extreme <- "sim_lakes/Sunapee_extreme"
# read in baseline met data & add offset
met_data <- read_delim(baseline_met, delim = ",") %>%
mutate(AirTemp = AirTemp + max_offset, # add max El Nino offset
time = as.POSIXct(strptime(time, "%Y-%m-%d %H:%M:%S", tz="EST"))) # make sure correct date/time format
# write new met data to new sim folder
write_delim(met_data, file.path(sim_folder_extreme, "met_hourly_extreme.csv"), delim = ",")

Finally, we need to edit the nml file for our scenario so it points to the correct met file. Open the glm2.nml file in the Sunapee_extreme folder8 and edit the meteo_fl option to point toward your met file.
8 Remember, this will be specific to the lake you’ve chosen.
Let’s double check that we edited the file correctly.
# define file path
nml_file_extreme <- file.path(sim_folder_extreme, "glm2.nml")
# read nml file
nml <- read_nml(nml_file_extreme)
# check met file name
get_nml_value(nml, "meteo_fl")

[1] "met_hourly_extreme.csv"
32.3 Acknowledgments
These activities are based on the EDDIE Teleconnections module.9
9 Farrell, K.J. and C.C. Carey. 18 May 2018. Macrosystems EDDIE: Teleconnections. Macrosystems EDDIE Module 3, Version 1.